AITopics | master node

Distributed Saddle-Point Problems Under Similarity

Neural Information Processing Systems | Apr-25-2026, 16:02:47 GMT

The local functions at each node are assumed to be similar, due to statistical data similarity or otherwise. We establish lower complexity bounds for a fairly general class of algorithms solving the saddle-point problem (SPP). We show that a given suboptimality $\varepsilon > 0$ is achieved over master/workers networks in $\Omega\left(\Delta \cdot \frac{\delta}{\mu} \cdot \log(1/\varepsilon)\right)$ rounds of communication, where $\delta > 0$ measures the degree of similarity of the local functions, $\mu$ is their strong convexity constant, and $\Delta$ is the diameter of the network. The lower communication complexity bound over mesh networks reads $\Omega\left(\frac{1}{\sqrt{\rho}} \cdot \frac{\delta}{\mu} \cdot \log(1/\varepsilon)\right)$, where $\rho$ is the (normalized) eigengap of the gossip matrix used for communication between neighbouring nodes. We then propose algorithms matching the lower bounds over both types of networks (up to log factors). We assess the effectiveness of the proposed algorithms on a robust regression problem.
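
To get a feel for how the two lower bounds scale, the sketch below evaluates the expressions from the abstract while ignoring the constants hidden by the asymptotic notation. It assumes the master/workers bound grows as diameter times (similarity / strong convexity) times log(1/accuracy), and the mesh bound replaces the diameter by one over the square root of the eigengap. The function names and all numeric inputs are hypothetical, chosen only for illustration.

```python
import math


def centralized_rounds(diameter, delta, mu, eps):
    """Scaling of the master/workers bound: diameter * (delta / mu) * log(1 / eps).

    Constants hidden by the asymptotic notation are ignored (assumption).
    """
    return diameter * (delta / mu) * math.log(1.0 / eps)


def decentralized_rounds(eigengap, delta, mu, eps):
    """Scaling of the mesh bound: (1 / sqrt(eigengap)) * (delta / mu) * log(1 / eps)."""
    return (1.0 / math.sqrt(eigengap)) * (delta / mu) * math.log(1.0 / eps)


# Hypothetical problem parameters, not taken from the paper.
delta, mu, eps = 0.5, 0.05, 1e-6
star_network = centralized_rounds(diameter=2, delta=delta, mu=mu, eps=eps)
mesh_network = decentralized_rounds(eigengap=0.04, delta=delta, mu=mu, eps=eps)

# Halving delta (more similar local functions) halves both quantities.
more_similar = centralized_rounds(diameter=2, delta=delta / 2, mu=mu, eps=eps)
```

The point of the comparison is that both quantities depend on the local functions only through the ratio of similarity to strong convexity, so more similar data directly reduces the number of communication rounds required.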

algorithm, artificial intelligence, machine learning, (14 more...)

Neural Information Processing Systems

Country: Europe > Russia (0.14)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Communications > Networks (0.90)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)

Coded Computing for Resilient Distributed Computing: A Learning-Theoretic Framework

Neural Information Processing Systems | Feb-18-2026, 04:19:55 GMT

Coded computing has emerged as a promising framework for tackling significant challenges in large-scale distributed computing, including the presence of slow, faulty, or compromised servers. In this approach, each worker node processes a combination of the data, rather than the raw data itself. The final result is then decoded from the collective outputs of the worker nodes. However, there is a significant gap between current coded computing approaches and the broader landscape of general distributed computing, particularly when it comes to machine learning workloads. To bridge this gap, we propose a novel foundation for coded computing, integrating the principles of learning theory and developing a framework that adapts seamlessly to machine learning applications. In this framework, the objective is to find the encoder and decoder functions that minimize the loss function, defined as the mean squared error between the estimated and true values. To facilitate the search for the optimal decoding and encoding functions, we show that the loss function can be upper-bounded by the sum of two terms: the generalization error of the decoding function and the training error of the encoding function. Focusing on the second-order Sobolev space, we then derive the optimal encoder and decoder.
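
The encode, compute, decode pattern described above can be illustrated with a minimal sketch. Note that this uses the classical linear-code setting (a random encoding matrix applied to a matrix-vector product), not the learning-theoretic encoder and decoder proposed in the paper. All names (`encode_blocks`, `decode`, the encoding matrix `G`) are hypothetical.

```python
import numpy as np


def encode_blocks(blocks, n_workers, rng):
    """Give each worker a random linear combination of the k data blocks."""
    k = len(blocks)
    G = rng.standard_normal((n_workers, k))  # hypothetical encoding matrix
    coded = [sum(G[i, j] * blocks[j] for j in range(k)) for i in range(n_workers)]
    return G, coded


def decode(G, results, responders):
    """Recover the k uncoded results from the first k responding workers."""
    k = G.shape[1]
    idx = responders[:k]
    G_sub = G[idx]                             # k x k, invertible with probability 1
    Y = np.stack([results[i] for i in idx])    # one row per responding worker
    return np.linalg.solve(G_sub, Y)           # row j equals blocks[j] @ x


rng = np.random.default_rng(0)
A = rng.standard_normal((6, 4))
x = rng.standard_normal(4)

blocks = np.split(A, 3)                        # k = 3 row blocks of A
G, coded = encode_blocks(blocks, n_workers=5, rng=rng)

# Each worker multiplies its coded block by x; it never sees a raw block.
results = {i: coded[i] @ x for i in range(5)}

responders = [0, 2, 4]                         # workers 1 and 3 straggle
recovered = decode(G, results, responders).reshape(-1)
```

Because the task is linear, any three of the five worker outputs suffice to reconstruct `A @ x` exactly. For general (non-linear) computations exact recovery of this kind is not available, which is the gap the abstract refers to.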

artificial intelligence, enc, machine learning, (16 more...)

Neural Information Processing Systems

Country:

North America > United States > Minnesota (0.04)
Oceania > Australia > New South Wales > Sydney (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Africa > Sudan (0.04)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.92)

Industry: Information Technology (0.67)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)

3d03b03197666b19c6a6e69812dd3e34-Paper-Conference.pdf

Neural Information Processing Systems | Feb-11-2026, 13:47:17 GMT

artificial intelligence, gaussian, machine learning, (14 more...)

Neural Information Processing Systems

Country:

Asia > Singapore (0.04)
Asia > Japan > Honshū > Chūbu > Ishikawa Prefecture > Kanazawa (0.04)

Genre: Research Report > Experimental Study (0.93)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)

44e65d3e9bc2f88b2b3d566de51a5381-Paper.pdf

Neural Information Processing Systems | Feb-8-2026, 10:17:19 GMT

algorithm, arxiv preprint arxiv, complexity, (12 more...)

Neural Information Processing Systems

Country:

Asia > Russia (0.04)
North America > United States (0.04)
Europe > Russia > Central Federal District > Moscow Oblast > Moscow (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Communications > Networks (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)

Coded Computing for Resilient Distributed Computing: A Learning-Theoretic Framework

Neural Information Processing Systems | Oct-10-2025, 16:40:03 GMT

Coded computing has emerged as a promising framework for tackling significant challenges in large-scale distributed computing, including the presence of slow, faulty, or compromised servers. In this approach, each worker node processes a combination of the data, rather than the raw data itself. The final result is then decoded from the collective outputs of the worker nodes. However, there is a significant gap between current coded computing approaches and the broader landscape of general distributed computing, particularly when it comes to machine learning workloads. To bridge this gap, we propose a novel foundation for coded computing, integrating the principles of learning theory and developing a framework that adapts seamlessly to machine learning applications. In this framework, the objective is to find the encoder and decoder functions that minimize the loss function, defined as the mean squared error between the estimated and true values. To facilitate the search for the optimal decoding and encoding functions, we show that the loss function can be upper-bounded by the sum of two terms: the generalization error of the decoding function and the training error of the encoding function. Focusing on the second-order Sobolev space, we then derive the optimal encoder and decoder.

computation, computing, enc, (14 more...)

Neural Information Processing Systems

Country:

North America > United States > Minnesota (0.04)
Oceania > Australia > New South Wales > Sydney (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Africa > Sudan (0.04)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.92)

Industry: Information Technology (0.67)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)

DOGS: Distributed-Oriented Gaussian Splatting for Large-Scale 3D Reconstruction Via Gaussian Consensus

Neural Information Processing Systems | Oct-9-2025, 23:54:44 GMT

Most recent 3DGS methods focus either on addressing the instability of rendering efficiency or on reducing the model size. By contrast, the training efficiency of 3DGS on large-scale scenes has received little attention.

computer vision, computer vision and pattern recognition, gaussian, (12 more...)

Neural Information Processing Systems

Country:

Asia > Singapore (0.04)
Asia > Japan > Honshū > Chūbu > Ishikawa Prefecture > Kanazawa (0.04)

Genre: Research Report > Experimental Study (0.93)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)

Cost-effective Deep Learning Infrastructure with NVIDIA GPU

Ghimire, Aatiz, Alam, Shahnawaz, Giri, Siman, Ghimire, Madhav Prasad

arXiv.org Artificial Intelligence | Mar-14-2025

The growing demand for computational power is driven by advancements in deep learning, the increasing need for big data processing, and the requirements of scientific simulations for academic and research purposes. Developing countries like Nepal often lack the resources needed to invest in new and better hardware for these purposes. However, optimizing and building on existing technology can still meet these computing demands effectively. To address these needs, we built a cluster using four NVIDIA GeForce GTX 1650 GPUs. The cluster consists of four nodes: one master node that controls and manages the entire cluster, and three compute nodes dedicated to processing tasks. The master node is equipped with all necessary software for package management, resource scheduling, and deployment, such as Anaconda and Slurm. In addition, a Network File System (NFS) share was integrated to provide the additional storage required by the cluster. Because the cluster is accessible via SSH through a public domain address, which poses significant cybersecurity risks, we implemented fail2ban to mitigate brute-force attacks and enhance security. Despite the continuous challenges encountered during the design and implementation process, this project demonstrates how powerful computational clusters can be built to handle resource-intensive tasks in various demanding fields.

artificial intelligence, machine learning, node, (19 more...)

arXiv.org Artificial Intelligence

2503.11246

Country:

Asia > Nepal > Bagmati Province > Kathmandu District > Kathmandu (0.05)
North America > United States > New York > New York County > New York City (0.04)

Genre: Research Report (1.00)

Industry:

Information Technology > Security & Privacy (1.00)
Government > Military > Cyberwarfare (0.34)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Hardware (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

General Coded Computing: Adversarial Settings

Moradi, Parsa, Akbarinodehi, Hanzaleh, Maddah-Ali, Mohammad Ali

arXiv.org Artificial Intelligence | Feb-11-2025

Conventional coded computing frameworks are predominantly tailored for structured computations, such as matrix multiplication and polynomial evaluation. Such tasks allow the reuse of tools and techniques from algebraic coding theory to improve the reliability of distributed systems in the presence of stragglers and adversarial servers. This paper lays the foundation for general coded computing, which extends the applicability of coded computing to a wide class of computations. In particular, it addresses the challenging problem of managing adversarial servers. We demonstrate that, in the proposed scheme, for a system with $N$ servers, of which $\mathcal{O}(N^a)$, $a \in [0,1)$, are adversarial, the supremum of the average approximation error over all adversarial strategies decays at a rate of $N^{\frac{6}{5}(a-1)}$, under minimal assumptions on the computing tasks. Furthermore, we show that within a general framework, the proposed scheme achieves optimal adversarial robustness, in terms of the maximum number of adversarial servers it can tolerate. This marks a significant step toward practical and reliable general coded computing. Implementation results further validate the effectiveness of the proposed method in handling various computations, including inference in deep neural networks.

artificial intelligence, machine learning, worker node, (17 more...)

arXiv.org Artificial Intelligence

2502.08058

Country: North America > United States > Minnesota (0.28)

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.54)

General Coded Computing in a Probabilistic Straggler Regime

Moradi, Parsa, Maddah-Ali, Mohammad Ali

arXiv.org Artificial Intelligence | Feb-1-2025

Coded computing has demonstrated promising results in addressing straggler resiliency in distributed computing systems. However, most coded computing schemes are designed for exact computation, requiring the number of responding servers to exceed a certain recovery threshold. Additionally, these schemes are tailored for highly structured functions. Recently, new coded computing schemes for general computing functions, where exact computation is replaced with approximate computation, have emerged. In these schemes, the availability of additional results corresponds to more accurate estimation of computational tasks. This flexibility introduces new questions that need to be addressed. This paper addresses the practically important scenario in the context of general coded computing, where each server may become a straggler with probability $p$, independently of the others. We theoretically analyze the approximation error of two existing general coded computing schemes: Berrut Approximate Coded Computing (BACC) and Learning Theoretic Coded Computing (LeTCC). Under the probabilistic straggler configuration, we demonstrate that the average approximation errors for BACC and LeTCC converge to zero at rates of at least $\mathcal{O}(\log^3_{\frac{1}{p}}(N)\cdot{N^{-3}})$ and $\mathcal{O}(\log^4_{\frac{1}{p}}(N)\cdot{N^{-2}})$, respectively. This is perhaps surprising, as earlier results do not indicate convergence when the number of stragglers scales with the total number of servers $N$. However, in this case, despite the average number of stragglers being $Np$, the independence of servers in becoming stragglers allows the approximation error to converge to zero. These theoretical results are validated through experiments on various computing functions, including deep neural networks.

artificial intelligence, enc, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2502.00645

Country: North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)

Genre: Research Report > New Finding (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.55)

Review for NeurIPS paper: Election Coding for Distributed Learning: Protecting SignSGD against Byzantine Attacks

Neural Information Processing Systems | Jan-27-2025, 09:17:39 GMT

Summary and Contributions: This paper addresses the problem of designing first-order optimization methods that are both communication-efficient and robust to Byzantine workers. In particular, the paper focuses on an existing variant of SignSGD, namely SignSGD with majority voting (SignSGD-MV), which is already communication-efficient by design. The paper proposes a new coding-theoretic approach to make SignSGD-MV robust to Byzantine workers. In the regular SignSGD-MV method, each of the n workers computes a gradient estimate based on the data partition assigned to it and sends the sign of the gradient estimate to the master node. The master node takes the coordinate-wise majority of the signed gradient estimates received from all the workers to obtain the final signed gradient estimate.
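
The regular SignSGD-MV round described in this summary can be sketched as follows. This is a minimal illustration of plain majority voting only; it does not implement the election-coding scheme under review. The function names, the learning rate, and the toy gradients are hypothetical.

```python
import numpy as np


def worker_sign(grad_estimate):
    """Worker side: send only the sign of each gradient coordinate."""
    return np.sign(grad_estimate).astype(np.int8)


def majority_vote(signs):
    """Master side: coordinate-wise majority over an (n_workers, dim) sign array.

    A tied coordinate sums to zero and yields 0, i.e. no update there.
    """
    return np.sign(np.sum(signs, axis=0)).astype(np.int8)


def signsgd_mv_step(params, worker_grads, lr):
    """One round: collect worker signs, vote, and step against the voted sign."""
    signs = np.stack([worker_sign(g) for g in worker_grads])
    return params - lr * majority_vote(signs)


# Toy round: four honest workers and one Byzantine worker that flips its gradient.
honest_grad = np.array([1.0, -2.0, 0.5])
worker_grads = [honest_grad] * 4 + [-honest_grad]

signs = np.stack([worker_sign(g) for g in worker_grads])
new_params = signsgd_mv_step(np.zeros(3), worker_grads, lr=0.1)
```

With four honest workers out of five, the vote still matches the honest sign in every coordinate. Once Byzantine workers form a majority on a coordinate, plain voting fails, which is the weakness the reviewed paper aims to address.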

byzantine worker, gradient estimate, signed gradient estimate, (14 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning (0.39)
